Abstract. In computer vision, we often face the problem of creating graphs
out of unstructured point-sets, i.e. the data graph. A common approach
consists of building a triangulation, which might not always lead to the
best solution: small changes in the location of the points might generate
graphs with unstable configurations, and the topology of the graph could
change significantly. After building the data graph, one could apply
Graph Matching techniques to register the original point-sets.
In this paper, we propose a data graph technique based on the Minimum
Spanning Tree of Maximum Entropy (MSTME). We aim at a data graph
construction which could be more stable than the Delaunay triangulation
with respect to small variations in the neighborhood of points. Our
technique aims at creating data graphs which could help the point-set
registration process. We propose an algorithm with a single free parameter
that weighs the importance between the total weight cost and the entropy
of the current spanning tree. We compare our algorithm with the Delaunay
triangulation on a number of different databases.


1 Introduction 

The problem of point-set registration often involves the construction of the
so-called data graph, which is the graph created out of unstructured
point-sets. One possibility for registering point-sets using graphs is to
formulate the problem as Graph Matching. We are interested in exploiting
topological properties of data graphs which would ease the registration
procedure. For instance, we could design a data graph which would decrease
the search space when matching points.

The Iterative Closest Point (ICP) algorithm [3] is a powerful solution for the
registration of point-sets. Such an algorithm tries to find, for each point in
one set, the closest point in the other set. With some prior knowledge, we
could avoid checking all points. For instance, if we could state that the
point we would like to match has a certain degree value, we could evaluate only
the points which match this criterion and greatly reduce the search space for
the registration. This would be true for a non-regular data graph¹; otherwise
we would still need to search the whole space of points.
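
The search-space argument above can be illustrated with a minimal sketch. The
function name `closest_point` and its parameters are hypothetical and are not
part of ICP or of this paper; the sketch only assumes that each point carries
the degree of its node in the data graph.

```python
from math import dist


def closest_point(query, targets, query_degree=None, target_degrees=None):
    """Return the index of the target closest to `query` (or None).

    If degree information is supplied, only targets whose degree equals
    `query_degree` are examined, which shrinks the search space.
    """
    candidates = range(len(targets))
    if query_degree is not None and target_degrees is not None:
        candidates = [i for i in candidates if target_degrees[i] == query_degree]
    if not candidates:
        return None
    return min(candidates, key=lambda i: dist(query, targets[i]))
```

In a regular graph every target has the same degree, so the filter keeps all
candidates and nothing is gained.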

We would like to obtain a graph with high variability in the degree
distribution. This high variability can be encoded by the entropy of the degree
distribution. In [5], we performed the registration of point-sets via the
Min-Weight Max-Entropy problem, in which the generated edge-induced
subgraph had the maximum entropy. In this work, we constrain our edge-induced
subgraph to be a tree. Our problem is called the Minimum Spanning Tree with
Maximum Entropy (MSTME). By minimizing the total edge weight, we
allow some robustness of the data graph with respect to small deviations of
the points; by maximizing the entropy, we increase the variability of the
degree distribution, which could later ease the registration process. We
evaluate the current approach under different levels of noise and compare the
stability of the data graph with that of a Delaunay triangulation.

1 A graph in which all nodes have the same degree value.

The remainder of this paper is organized as follows: the related work is
presented in Section 2. Section 3 discusses our objective function and explains
how the entropy is measured in the graph. Our optimization strategy is explained
in detail in Section 4. Our experimental setup is described in Section 5. Our
conclusions and future work are presented in Section 6.

2 Related Work 

A data graph can be obtained in a variety of ways; it is common to use the
Delaunay triangulation for the registration of point-sets. The Delaunay
triangulation is based on the condition that no other point should lie inside
the circumcircle of any triangle. It is the dual of the Voronoi diagram, which
partitions the embedding space into the regions closest to each point of the
point set. In this work, we look for suitable alternatives which would allow
more stability during the registration.

Our cost function is built upon a Minimum Spanning Tree formulation and
entropy maximization. In this sense, our work is closely related to [3], as the
authors also formulate an optimization problem focusing on the entropy of a
Minimum Spanning Tree. However, they deal with a different problem,
proposing an entropy estimator for clustering. In our work, we are interested
in the data graph construction problem, and our cost function differs from
theirs: our entropy is calculated on the degree distribution of the generated
MST, while they estimate the data entropy based on the length of the spanning
tree.

Our problem can also be seen as a bi-criteria optimization problem. Ravi and
Goemans proposed a Lagrangian relaxation to solve the constrained minimum
spanning tree problem, which is also a bi-criteria optimization problem, proven
to be NP-Hard by Aggarwal et al. Neighborhood search and adjacency search
heuristics for the bicriterion minimum spanning tree problem were proposed
by Andersen et al. [2].

3 Objective Function 

In order to measure the diversity found in the degree distribution, we can
calculate the Shannon entropy (H) for a graph G(V, E) as follows:

    H(G) = -\sum_{v \in V_\gamma} p(v) \log_2 p(v),    (1)

where p(v) is the probability of finding a node with a degree of v among all
distinct degree values (V_\gamma is the set of distinct degree values of V).
The entropy measures the uncertainty associated with a random variable. As
defined in Eq. 1, a high entropy H(G) of a graph G(V, E) would indicate a high
“variability” in the degree distribution of the nodes V. The converse is also
true: a low entropy means low variability, as in a k-regular graph, for which
p(k) = 1 and log(1) = 0.
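
The entropy of Eq. 1 can be sketched in a few lines. This is a minimal
illustration, assuming the graph is given as a list of edges (pairs of node
labels) and that only nodes touched by at least one edge are counted; the
function name `degree_entropy` is ours.

```python
from collections import Counter
from math import log2


def degree_entropy(edges):
    """Shannon entropy (in bits) of the degree distribution of a graph."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    if not degree:
        return 0.0
    n = len(degree)
    # histogram maps each distinct degree value to the number of nodes having it
    histogram = Counter(degree.values())
    return sum((c / n) * log2(n / c) for c in histogram.values())


# A 2-regular graph (a 4-cycle) has a single degree value, hence zero entropy.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
# A star on 5 nodes has one node of degree 4 and four nodes of degree 1.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(degree_entropy(cycle), round(degree_entropy(star), 2))
```

The star gives p(4) = 0.2 and p(1) = 0.8, i.e. an entropy of about 0.72 bits.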

[Figure 1 panels: a) MST, H(MST) = 0.72; b) MSTME (λ = 1), H(MSTME) = 1.37;
c) Delaunay, H(Delaunay) = 0.72.]

Figure 1: The first graph (a) is obtained with a Minimum Spanning Tree (MST)
and has cost c(MST) = 4 and H(MST) = 0.72. Graph (b) is obtained using
our approach of MST with Maximum Entropy (MSTME). Since H(MSTME)
= 1.37, the cost c(MSTME, λ = 1) = 3.04; hence, c(MSTME) < c(MST). The
last graph (c) is a Delaunay triangulation and it has the same entropy as the
MST.

Given a point-set X, we calculate the distance from each point to all the
other points and store those distances in W. This generates a complete graph
K_{|X|}(X, W). We seek a vector U \in \{0,1\}^{|W|} such that, when
U_i = 1, we add W_i to our graph. We can formulate our cost function simply as:

    \min_{U}     \sum_{i=1}^{|W|} W_i U_i - \lambda H(G[U])
    subject to   U \in \{0,1\}^{|W|},                          (2)
                 \sum_{i=1}^{|W|} U_i = |X| - 1,
                 G[U] is connected.

where G[U] is the edge-induced subgraph from vector U, and λ is a parameter
which weighs the importance between the two terms: minimizing the total edge
weight and maximizing the graph entropy. The first constraint states that we
have a discrete problem, i.e. we either add an edge (U_i = 1) or we do not add
it. The second constraint is a necessary condition for our graph to be a tree;
however, we need the third constraint to force the graph to be connected.
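
The objective and its constraints can be checked on a small example. The
sketch below is ours: the point coordinates are an assumption (a unit cross
with a centre point, which happens to reproduce the values quoted for
Figure 1), and the names `is_spanning_tree` and `mstme_cost` are hypothetical.

```python
from collections import Counter
from math import dist, log2


def degree_entropy(edges):
    """Entropy (bits) of the degree distribution, as in Eq. 1."""
    degree = Counter(node for edge in edges for node in edge)
    n = len(degree)
    histogram = Counter(degree.values())
    return sum((c / n) * log2(n / c) for c in histogram.values()) if n else 0.0


def is_spanning_tree(n, edges):
    """Constraints of Eq. 2: exactly n - 1 edges and a connected graph."""
    if len(edges) != n - 1:
        return False
    adjacency = {i: [] for i in range(n)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen, stack = {0}, [0]
    while stack:  # depth-first search from node 0
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def mstme_cost(points, edges, lam):
    """Total Euclidean edge weight minus lam times the degree entropy."""
    weight = sum(dist(points[u], points[v]) for u, v in edges)
    return weight - lam * degree_entropy(edges)


# Assumed layout: node 0 at the centre, nodes 1-4 at unit distance from it.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
star = [(0, 1), (0, 2), (0, 3), (0, 4)]      # minimum-weight tree
tree_b = [(0, 1), (0, 2), (0, 3), (3, 4)]    # heavier, more varied degrees
print(mstme_cost(points, star, 0.0), round(mstme_cost(points, tree_b, 1.0), 2))
```

With λ = 1 the heavier tree has the lower cost, because its degree values
{3, 2, 1, 1, 1} are more varied than those of the star.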

Figure 1 displays three different graphs of the same point-set
X = {a, b, c, d, e}. The first graph (Figure 1a) is simply a Minimum Spanning
Tree (MST) whose cost is c(MST, λ = 0) = 4. It could be obtained by setting
λ = 0, in which case the entropy term vanishes from Equation 2. The second
graph (Figure 1b) now considers λ = 1: a higher entropy decreases the cost of
the solution in comparison with the MST, and the cost is
c(MSTME, λ = 1) = 3.04. Notice that although the edge weight cost is higher,
the total cost decreased due to the higher entropy. Finally, we show in
contrast that a Delaunay triangulation (Figure 1c) yields the same entropy as
the MST.
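
The comparison can be made explicit with the values reported for Figure 1.
The arithmetic below only rearranges those reported numbers; the symbol w,
standing for the total edge weight of graph (b), is introduced here for
illustration.

```latex
\begin{align*}
c(\mathrm{MST}, \lambda = 0)   &= 4 - 0 \cdot 0.72 = 4, \\
c(\mathrm{MST}, \lambda = 1)   &= 4 - 1 \cdot 0.72 = 3.28, \\
c(\mathrm{MSTME}, \lambda = 1) &= w - 1 \cdot 1.37 = 3.04
  \;\Longrightarrow\; w = 4.41 .
\end{align*}
```

So at λ = 1 the tree in (b) pays 0.41 more in edge weight than the MST but
gains 0.65 in entropy, which is why its total cost is lower (3.04 < 3.28).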


4 Optimization for the MSTME Problem 

Our optimization is based on the following strategy: we pose the MSTME as a 
capacity problem with respect to the edge set. At each iteration, we increase the 
capacity of the edges, and we add the one whose cost is minimum based on our 
cost function which evaluates the weight and entropy of the current generated 
MST.

The input is a complete graph K_{|X|}(X, W) whose vertices are the points of
the point-set and whose edges are weighted according to the Euclidean distance.
For each edge e, we check whether adding it to the subgraph G[U] induced by our
vector U creates a cycle, in which case we discard it (line 8).
Otherwise, we calculate the new cost from the weight of e and the entropy of
the resulting tree (line 12). In line 16, we remove this edge from our graph
and continue searching within the same capacity for an edge which could yield
a smaller cost. At the end, we add the edge with the best cost (line 17) and
increase the capacity of our MST, up to |X| - 1.


Data: K_{|X|}(X, W), λ
Result: G[U]

 1  begin
 2      U <- {};
 3      // We increase the capacity up to |X| - 1
 4      for i = 1 to |X| - 1 do
 5          cost <- ∞;
 6          best <- ∅;
 7          foreach e ∈ E \ U do
 8              if G[U ∪ {e}] has cycle then
 9                  continue;
10              else
11                  U <- U ∪ {e};
12              ecost <- W_e - λ H(G[U]);
13              if ecost < cost then
14                  cost <- ecost;
15                  best <- e;
16              U <- U - {e};
17          U <- U ∪ {best};
18      return G[U];



Algorithm 1: The Minimum Spanning Tree with Maximum Entropy. 
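
Algorithm 1 can be sketched in Python as follows. This is an illustrative
sketch, not the authors' implementation: the cycle test of line 8 is done
with a union-find structure (our choice), the entropy is computed over the
nodes touched by the selected edges (an assumption), and the names `mstme`
and `find` are ours.

```python
from collections import Counter
from itertools import combinations
from math import dist, inf, log2


def degree_entropy(edges):
    """Entropy (bits) of the degree distribution, as in Eq. 1."""
    degree = Counter(node for edge in edges for node in edge)
    n = len(degree)
    histogram = Counter(degree.values())
    return sum((c / n) * log2(n / c) for c in histogram.values()) if n else 0.0


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def mstme(points, lam):
    """Greedy construction: one edge per iteration, |X| - 1 iterations."""
    n = len(points)
    # Complete graph: every pair of points, weighted by Euclidean distance.
    candidates = {(u, v): dist(points[u], points[v])
                  for u, v in combinations(range(n), 2)}
    parent = list(range(n))
    tree = []
    for _ in range(n - 1):
        best, best_cost = None, inf
        for (u, v), w in candidates.items():
            if find(parent, u) == find(parent, v):
                continue  # line 8: the edge would close a cycle
            # line 12: weight of e minus lam times entropy with e added
            cost = w - lam * degree_entropy(tree + [(u, v)])
            if cost < best_cost:
                best, best_cost = (u, v), cost
        tree.append(best)                      # line 17
        del candidates[best]
        parent[find(parent, best[0])] = find(parent, best[1])
    return tree


# Assumed example layout: a centre point and four points at unit distance.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
print(mstme(points, 0.0))
print(mstme(points, 1.0))
```

With lam = 0 the selection rule reduces to picking the lightest edge that
does not close a cycle, i.e. Kruskal's algorithm, so the result is a minimum
spanning tree. Recomputing the entropy for every candidate keeps the sketch
short at the price of efficiency.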


5 Experiments 

Our experimental section focuses on evaluating the stability of our algorithm
on point-sets sampled from the silhouette of objects. Figure 2 shows some
point-sets connected using the MSTME. It is noticeable that the entropy takes
over in thin areas, i.e. the degree values of the points are higher in regions
such as the top of the fish and the feet of the dog, while other regions follow
the silhouette of the object, since it is too costly to make an edge cross the
shape of the object.

Our noise experiment is built upon the shortest pairwise distance (ε) in the
point-set. We perturb each point x ∈ X by a length and an angle (l, α),
where α ∈ [0, 360) and the radius l ∈ [0, r·ε], with r varying in [1, 10];
e.g. 5ε means that each point of the dataset was perturbed within a radius of
5 times the shortest pairwise distance in the graph. Certainly, the higher the
noise, the lower the stability of the algorithms will be.
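
The noise model can be sketched as below. The text does not state how the
length and the angle are drawn, so uniform sampling is an assumption here,
as are the function names `shortest_pairwise_distance` and `perturb`.

```python
import random
from itertools import combinations
from math import cos, dist, radians, sin


def shortest_pairwise_distance(points):
    """Smallest Euclidean distance between any two points of the set."""
    return min(dist(p, q) for p, q in combinations(points, 2))


def perturb(points, r, rng=random):
    """Shift every point by a random length in [0, r * eps] at a random angle."""
    eps = shortest_pairwise_distance(points)
    noisy = []
    for x, y in points:
        length = rng.uniform(0.0, r * eps)       # assumed uniform
        angle = radians(rng.uniform(0.0, 360.0))  # assumed uniform
        noisy.append((x + length * cos(angle), y + length * sin(angle)))
    return noisy


pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
noisy = perturb(pts, 5, random.Random(0))  # seeded for reproducibility
```

Each perturbed point keeps the index of its original, so edges of graphs
built on different perturbed copies can be compared directly.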

Figure 3 shows the impact of noise on the point-sets, varying according to
ε, in a cow silhouette. It is not an easy task even for humans to match such
points (e.g. the feet). It is worth mentioning that as the algorithm searches
for the optimum, it has to decide which edges to take; therefore, a single
wrong edge will have a much higher impact on the results when compared with
the Delaunay triangulation, which has many more edges.

Figure 2: Data graphs connected using the MSTME (λ = 0.5).

[Figure 3 panels: Noise up to 3ε; Noise up to 5ε; Noise up to 8ε; Noise up to 10ε.]
Figure 3: Noise added to the point-set.

As our algorithm outputs a tree, it cannot contain cycles; therefore, the
silhouette will “break” in locations where the weights of the edges are high,
and the split point will vary according to the noise. Notice that the split
occurred in the face of the cow when the noise level was 10ε, but at level 8ε
it happened at a different location.

Figure 4 shows the quantitative analysis of stability between MSTME and
Delaunay triangulation. We evaluate the stability by calculating the
percentage of edges which remained stable in all graphs within the same noise
level. The first plot shows the results for the fish dataset². The y-axis
displays the percentage of stable edges, e.g. 0.8 means that 80% of edges were
found across all data graphs within the same level. For this dataset, our
algorithm for the MSTME was always more stable than the Delaunay triangulation
up to the highest noise level. The boxplot also displays other information,
such as the median and the quartiles. The instability observed was found on the
thin parts of the fish, such as the dorsal and caudal fins. The second plot
shows the results on the cow1 dataset³.
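
One way to compute such a stability score is sketched below. The text does
not say which edge count the percentage is taken over; dividing by the mean
number of edges per graph is our assumption, and `stable_edge_fraction` is a
hypothetical name.

```python
def stable_edge_fraction(graphs):
    """Fraction of edges that appear in every graph of the same noise level.

    `graphs` is a list of edge lists over the same node indices; edges are
    treated as undirected, so (u, v) and (v, u) count as the same edge.
    """
    if not graphs:
        return 0.0
    edge_sets = [{frozenset(edge) for edge in graph} for graph in graphs]
    stable = set.intersection(*edge_sets)
    mean_size = sum(len(s) for s in edge_sets) / len(edge_sets)
    return len(stable) / mean_size if mean_size else 0.0


# Three trees on 4 nodes; only the edges {0, 1} and {1, 2} occur in all three.
graphs = [
    [(0, 1), (1, 2), (2, 3)],
    [(1, 0), (1, 2), (1, 3)],
    [(0, 1), (2, 1), (0, 3)],
]
print(stable_edge_fraction(graphs))
```

For spanning trees every graph has |X| - 1 edges, so the choice of
denominator only matters for the triangulation.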

We observed that the MSTME was, in general, more stable than the Delaunay
triangulation. Small variations in the location of the nodes affected the
triangulation more than they affected the MSTME. As the triangulation
generates many more edges than the tree, a single wrong edge has a
comparatively higher impact on the results of the MSTME. Nevertheless, it was
still able to obtain higher stability than Delaunay up to a certain level of
noise.


2 The fish dataset was obtained from the Coherent Point Drift [9].
3 The cow1 dataset was obtained from the kimia99 dataset.

[Figure 4 panels: Fish - MaxEntropyMST vs Delaunay; Cow1 - MaxEntropyMST vs
Delaunay.]
Figure 4: Boxplot comparison of the stability on the fish and cow1 datasets.
The noise consists of shifting each individual point by up to ε × level, with
the level varying from 1 to 10. For each noise level, we generated 30
perturbed datasets and computed the number of stable edges across all results.


6 Conclusions 

In this paper, we proposed a data-graph technique by searching for a tree whose
cost is minimum regarding the total edge weight and maximum regarding the
entropy of the degree distribution. We called this problem the Minimum Spanning
Tree of Maximum Entropy (MSTME). In point-sets sampled from the silhouette
of an object, we observed that points lying in thin areas were often unstable
when there was noise. After analyzing and comparing the robustness of the
generated graphs, we observed that the data-graph generated by the MSTME was
more stable than the one generated by the Delaunay triangulation.

In the future, we would like to propose an approximation algorithm which
could guarantee bounds on the MSTME problem. Moreover, we will apply
this data-graph in a framework similar to ICP, evaluate the accuracy of
the results obtained, and compare them with those of other data-graph
techniques.